A Regularity Measure for Context Free Grammars
Abstract
Parikh’s theorem states that every Context Free Language (CFL) has the same Parikh image as some regular language. A finite state automaton accepting such a regular language is called a Parikh-equivalent automaton. In the worst case, the number of states in any non-deterministic Parikh-equivalent automaton is exponential in the size of the Context Free Grammar (CFG). We associate a regularity width d with a CFG, which measures how close the CFL is to the regular languages. The degree m of a CFG is one less than the maximum number of variable occurrences in the right-hand side of any production. Given a CFG with n variables, we construct a Parikh-equivalent non-deterministic automaton whose number of states is upper bounded by a polynomial in n(d), the degree of the polynomial being a small fixed constant. Our procedure is constructive and runs in time polynomial in the size of the automaton. In the terminology of parameterized complexity, we prove that constructing a Parikh-equivalent automaton for a given CFG is Fixed Parameter Tractable (FPT) when the degree m and the regularity width d are the parameters. We also give an example from the program verification domain where the degree and the regularity width are small compared to the size of the grammar.
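The two definitions the abstract relies on, the Parikh image of a word and the degree m of a grammar, can be illustrated with a minimal Python sketch. It is not the paper's construction. The function names `parikh_image` and `degree`, the list-of-pairs grammar encoding, and the toy grammars are all assumptions made for this example.

```python
from collections import Counter

def parikh_image(word, alphabet):
    """Return the vector of symbol counts of `word`, ordered as in `alphabet`.

    The Parikh image forgets the order of symbols and keeps only how often
    each one occurs.
    """
    counts = Counter(word)
    return tuple(counts[symbol] for symbol in alphabet)

def degree(productions, variables):
    """Degree as defined in the abstract: one less than the maximum number
    of variable occurrences in the right-hand side of any production.

    `productions` is a list of (head, body) pairs, where `body` is a list
    of symbols; this encoding is an assumption of the sketch.
    """
    most_variables = max(
        sum(1 for symbol in body if symbol in variables)
        for _head, body in productions
    )
    return most_variables - 1

# Hypothetical toy grammar for { a^n b^n }: S -> a S b | (empty)
linear_grammar = [("S", ["a", "S", "b"]), ("S", [])]

# Hypothetical toy grammar with a branching production: S -> S S | a
branching_grammar = [("S", ["S", "S"]), ("S", ["a"])]

# "aabb" is in the context-free language { a^n b^n }, "abab" is in the
# regular language (ab)*, and both words have the same Parikh image.
same_image = parikh_image("aabb", "ab") == parikh_image("abab", "ab")
```

Under this encoding the linear grammar has degree 0 and the branching grammar has degree 1. The last line shows, for a single pair of words, the kind of coincidence that Parikh's theorem guarantees at the level of whole languages.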
Similar resources
On the Synchronized Derivation Depth of Context-Free Grammars
We consider depth of derivations as a complexity measure for synchronized and ordinary context-free grammars. This measure differs from the earlier considered synchronization depth in that it counts the depth of the entire derivation tree. We consider (non-)existence of trade-offs when using synchronized grammars as opposed to non-synchronized grammars and establish lower bounds for certain cla...
Studying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree banks is becoming more widely appreciated in linguistics research as a whole. F...
Comparing the Ambiguity Reduction Abilities of Probabilistic Context-Free Grammars
We present a measure for evaluating Probabilistic Context Free Grammars (PCFG) based on their ambiguity resolution capabilities. Probabilities in a PCFG can be seen as a filtering mechanism: for an ambiguous sentence, the trees bearing maximum probability are singled out, while all others are discarded. The level of ambiguity is related to the size of the singled out set of trees. Under our meas...
Structure Preserving Transformations on Non-Left-Recursive Grammars (Preliminary Version)
1. INTRODUCTION AND PRELIMINARIES If a context-free grammar is transformed to another context-free grammar, in most cases it is quite obvious to demand weak equivalence for these two grammars. Transformations on context-free grammars can be defined for several reasons. Dependent on these reasons one may be interested in stronger relations of grammatical similarity. Instead of arbitrary c...
The Ancestor Width of Grammars and Languages
The ancestor width is a new measure for the structure of derivations of arbitrary grammars. For every production used in a derivation or equivalently for every leaf we consider the strings of ancestors. The ancestors define a complexity measure with a local flavour. Obviously, context-free grammars have ancestor width one. We show that languages with ancestor width two are context-free. However, ...
Journal: CoRR
Volume: abs/1109.5615
Publication date: 2011